Misc News

I'm back... Going through my RSS feeds, two items caught my attention:

Tim Bray: Charles Nutter and Thomas Enebo, better known as “The JRuby Guys”, are joining Sun this month.

Jon Udell: Why argue about dynamic versus static languages when you can use both? Which discusses, among other things, why the first three versions of the IronPython compiler were written in Python, but today it's written in C#.

Topology in Programming Language Semantics

A recent story over at Ars Mathematica reminded me that I have seen a lot of interesting work in applying topology to programming language semantics.

The paper linked-to on Ars Mathematica is more about applications to software engineering (precise notion of refinement, but since implementations are the ultimate refinement of a specification, this is quite relevant to PLs as well). But there is a lot more work in this area! For those with a theoretical bent, there are a series of articles by John Tucker and Jeffery (Jeff) Zucker, for example Abstract versus Concrete Computation on Metric Partial Algebras (many more available from their respective web pages). Another thread that I like is the work of Abbas Edalat; he has written many papers relating topology, domain theory and computations in analysis. I am particularly fond of the work of Martin Escardo; the lecture notes on Mathematical foundations of functional programming with real numbers might interest a few people here. But as far as I am concerned (and Haskell fans might agree), his Magnum Opus is Synthetic topology of data types and classical spaces.

Vacation

I am going to be away until the end of next week. I hope you enjoy yourselves while I am gone.

Gradual Typing for Functional Languages

Gradual Typing for Functional Languages

Static and dynamic type systems have well-known strengths and weaknesses, and each is better suited for different programming tasks. There have been many efforts to integrate static and dynamic typing and thereby combine the benefits of both typing disciplines in the same language. The flexibility of static typing can be improved by adding a type Dynamic and a typecase form. The safety and performance of dynamic typing can be improved by adding optional type annotations or by performing type inference (as in soft typing). However, there has been little formal work on type systems that allow a programmer-controlled migration between dynamic and static typing. Thatte proposed Quasi-Static Typing, but it does not statically catch all type errors in completely annotated programs. Anderson and Drossopoulou defined a nominal type system for an object-oriented language with optional type annotations. However, developing a sound, gradual type system for functional languages with structural types is an open problem.

In this paper we present a solution based on the intuition that the structure of a type may be partially known/unknown at compile-time and the job of the type system is to catch incompatibilities between the known parts of types. We define the static and dynamic semantics of a λ-calculus with optional type annotations and we prove that its type system is sound with respect to the simply-typed λ-calculus for fully-annotated terms. We prove that this calculus is type safe and that the cost of dynamism is “pay-as-you-go”.

In other news, the Holy Grail has been found. Film at 11.

This piece of genius is the combined work of Jeremy Siek, of Boost fame, and Walid Taha, of MetaOCaml fame. The formalization of their work in Isabelle/Isar can be found here.

I found this while tracking down the relocated Concoqtion paper. In that process, I also found Jeremy Siek's other new papers, including his "Semantic Analysis of C++ Templates" and "Concepts: Linguistic Support for Generic Programming in C++." Just visit Siek's home page and read all of his new papers, each of which is worth a story here.

Reflective Program Generation with Patterns

Reflective Program Generation with Patterns. Manuel Fähndrich, Michael Carbin, James R. Larus. October 2006.

Runtime reflection facilities, as present in Java and .NET, are powerful mechanisms for inspecting existing code and metadata, as well as generating new code and metadata on the fly. Such power does come at a high price though. The runtime reflection support in Java and .NET imposes a cost on all programs, whether they use reflection or not, simply by the necessity of keeping all metadata around and the inability to optimize code because of future possible code changes. A second---often overlooked---cost is the difficulty of writing correct reflection code to inspect or emit new metadata and code and the risk that the emitted code is not well-formed. In this paper we examine a subclass of problems that can be addressed using a simpler mechanism than runtime reflection, which we call compile-time reflection.We argue for a high-level construct called a transform that allows programmers to write inspection and generation code in a pattern matching and template style, avoiding at the same time the complexities of reflection APIs and providing the benefits of staged compilation in that the generated code and metadata is known to be well-formed and type safe ahead of time.

Macros, multi-staged programming etc. are the appropriate buzzowrds.

LtU readers will probably be interested in the STM example (see sec. 7.1)

Lightweight Static Capabilitites (II)

The slides for the talk discussed here are now available for download. Like all of Ken and Oleg's work, this stuff is both cool and important.

Keep in mind that the talk is quite different from the paper. The safety claims were formalized and proved in Twelf: list example, array example. To follow the proofs you should read them alongside the presentation slides. I am told that the first file might change soon, to reflect a more general proof. Perhaps Ken or Oleg would like to comment on the experience of doing mechanized proofs, a subject the comes up regularly on LtU.

LtU newcomers, who already managed to take in a little Haskell or ML, may want to spend a little time chewing on this, and ask questions if something remains unclear, since this work may eventually have practical importance, and can teach quite a few interesting techniques.

The Daikon Invariant Detector

Daikon

Daikon is an implementation of dynamic detection of likely invariants; that is, the Daikon invariant detector reports likely program invariants. An invariant is a property that holds at a certain point or points in a program; these are often seen in assert statements, documentation, and formal specifications. Invariants can be useful in program understanding and a host of other applications. Examples include “.field > abs(y)”; “y = 2*x+3”; “array a is sorted”; “for all list objects lst, lst.next.prev = lst”; “for all treenode objects n, n.left.value < n.right.value”; “p != null => p.content in myArray”; and many more. You can extend Daikon to add new properties (see Enhancing Daikon output).

Dynamic invariant detection runs a program, observes the values that the program computes, and then reports properties that were true over the observed executions.

Daikon can detect properties in C, C++, Java, Perl, and IOA programs; in spreadsheet files; and in other data sources. (Dynamic invariant detection is a machine learning technique that can be applied to arbitrary data.) It is easy to extend Daikon to other applications; as one example, an interface exists to the Java PathFinder model checker.

I spend a lot of time here talking about static typing, but let's face it: most often we're dealing with existing code that probably isn't written in a language with a very expressive type system, and rarely has been formally specified, whether through an expressive type system or otherwise. Daikon is interesting because it attempts to learn important properties of a piece of software by execution and observation. Combine it with a model checker like Java PathFinder, and you have an unusually powerful means of evolving the correctness of even quite complex code. There's also a relationship to the already-mentioned JML and ESC/Java 2, which in turn has a connection to the popular JUnit unit-testing framework. In short, the gap between compile time and runtime, and static vs. dynamic typing, seems to be narrowed in powerful ways by these, and related, tools.

Declarative Networking: Language, Execution and Optimization

Declarative Networking: Language, Execution and Optimization, Boon Thau Loo, Tyson Condie, Minos Garofalakis, David A. Gay, Joseph M. Hellerstein, Petros Maniatis, Raghu Ramakrishnan, Timothy Roscoe and Ion Stoica.

First, we motivate and formally define the Network Datalog (NDlog) language for declarative network specifications.
Second, we introduce and prove correct relaxed versions of the traditional semi-naive query evaluation technique, to overcome fundamental problems of the traditional technique in an asynchronous distributed setting. Third, we consider the dynamics of network state, and formalize the “eventual consistency” of our programs even when bursts of updates can arrive in the midst of query execution. Fourth, we present a number of query optimization opportunities that arise in the declarative networking context, including applications of traditional techniques as well as new optimizations. Last, we present evaluation results of the above ideas implemented in our P2 declarative networking system, running on 100 machines over the Emulab network testbed.

I will be the first to admit that I somehow fundamentally do not get the logic programming style, but presenting a routing discovery protocol in about eight lines of code is pretty cool.

ESC not just for Java any more

Dana N. Xu's ESC/Haskell:
I this paper, we describe an extended static checking tool for Haskell, named ESC/Haskell, that is based on symbolic computation and assisted by a few novel properties. One novelty is our use of Haskell as the specification language itself for pre/post conditions.

Verified Software: Theories, Tools, and Experiments

Verified Software: Theories, Tools, and Experiments, VSTTE 2006, Workshop proceedings. K. Rustan M. Leino; Wolfram Schulte. August 2006.

The papers are:

  • Formalising a High-Performance Microkernel written in Haskell, using Isabelle/HOL.
  • Automated Verification of UPC Memory Consistency. UPC is a shared memory extension of C.
  • Automated Model-based Verification of Object-Oriented Code, describing ESpec which is a suite of tools that facilitate the testing and verification of Eiffel programs.
  • Cross-Verification of JML Tools: An ESC/Java2 Case Study which concludes that the use of JML RAC uncovered desgin flaws that ESC/Java2 was unable to report.
  • Static Stability Analysis of Embedded, Autocoded Software.